NLP Architect integrates the Transformer models available in pytorch-transformers. A pre-trained Transformer model is usually applied by attaching a classification head to the transformer and fine-tuning the whole model (transformer and classifier) on the target (downstream) task.
TransformerBase is a base class that handles loading, saving, training, and inference of transformer models.
The base model supports the pytorch-transformers configs, tokenizers, and base models documented on their website (see our base class for the supported models).
To use the Transformer models, sub-class the base model and include:
- A classifier (head) for your task.
- A method that converts the input into the tensors used by the model.
- Any methods needed to evaluate the task, run inference, etc.
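The three pieces listed above can be sketched as follows. This is a minimal, dependency-free illustration of the sub-classing pattern only: `SketchTransformerBase`, `ToySentenceClassifier`, and every method name here are hypothetical stand-ins and not NLP Architect's actual API, so consult the real base class for its method names and signatures.

```python
from typing import Dict, List, Sequence


class SketchTransformerBase:
    """Hypothetical stand-in for a transformer base class (NOT the real TransformerBase)."""

    def __init__(self, vocab: Dict[str, int]):
        # Stands in for a pre-trained tokenizer.
        self.vocab = vocab

    def encode(self, token_ids: Sequence[int]) -> List[float]:
        # Stands in for the pre-trained transformer: maps token ids to a
        # fixed-size feature vector (mean token id, sequence length).
        n = max(len(token_ids), 1)
        return [sum(token_ids) / n, float(len(token_ids))]


class ToySentenceClassifier(SketchTransformerBase):
    """Hypothetical sub-class that adds the three task-specific pieces."""

    def __init__(self, vocab: Dict[str, int], labels: List[str],
                 weights: List[List[float]], bias: List[float]):
        super().__init__(vocab)
        self.labels = labels
        self.weights = weights  # one row of weights per label
        self.bias = bias

    # 1. A classifier (head) for the task: a single linear layer.
    def head(self, features: List[float]) -> List[float]:
        return [sum(w * f for w, f in zip(row, features)) + b
                for row, b in zip(self.weights, self.bias)]

    # 2. Conversion of raw input into the structure the model consumes.
    def convert_to_tensors(self, sentence: str) -> List[int]:
        return [self.vocab.get(tok, 0) for tok in sentence.lower().split()]

    # 3. Inference and evaluation for the task.
    def inference(self, sentences: List[str]) -> List[str]:
        predictions = []
        for sentence in sentences:
            scores = self.head(self.encode(self.convert_to_tensors(sentence)))
            predictions.append(self.labels[scores.index(max(scores))])
        return predictions

    def evaluate(self, sentences: List[str], gold: List[str]) -> float:
        predictions = self.inference(sentences)
        return sum(p == g for p, g in zip(predictions, gold)) / len(gold)
```

In a real sub-class the encoder and tokenizer would come from the pre-trained model, and the head would be a trainable layer that is fine-tuned together with the transformer.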
Available transformer family models in NLP Architect:
TransformerSequenceClassifier is a transformer model with a sentence classification head (the output at the [CLS] token is used to predict the label) for sentence-level tasks (classification or regression).
See nlp_architect.procedures.transformers.glue for an example of training sequence classification models on GLUE benchmark tasks.
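As a rough illustration of the sentence classification head described above, the sketch below applies a linear layer followed by a softmax to the hidden vector at the [CLS] position. It is plain Python with hand-written toy numbers. The function names, shapes, and weights are assumptions made for illustration and are not taken from NLP Architect or pytorch-transformers, and placing [CLS] at position 0 assumes a BERT-style input layout.

```python
import math
from typing import List, Tuple


def cls_vector(hidden_states: List[List[float]]) -> List[float]:
    # Assumption: BERT-style layout, where [CLS] is the first token.
    return hidden_states[0]


def linear(x: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    # One output (logit) per row of weights.
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits: List[float]) -> List[float]:
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(hidden_states: List[List[float]],
             weights: List[List[float]],
             bias: List[float]) -> Tuple[int, List[float]]:
    # Only the [CLS] vector feeds the head; other token vectors are ignored.
    probs = softmax(linear(cls_vector(hidden_states), weights, bias))
    return probs.index(max(probs)), probs


if __name__ == "__main__":
    # Two tokens, hidden size 2; only the first row ([CLS]) matters.
    hidden = [[1.0, 0.0], [5.0, 5.0]]
    label, probs = classify(hidden, weights=[[2.0, 0.0], [0.0, 2.0]], bias=[0.0, 0.0])
    print(label, probs)
```

For a regression task, the same linear layer would have a single output row and the softmax would be skipped, so the head returns one real-valued score.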
Training a model on a GLUE task, using the BERT-base uncased base model:
nlp_architect train transformer_glue \
    --task_name <task name> \
    --model_name_or_path bert-base-uncased \
    --model_type bert \
    --output_dir <output dir> \
    --evaluate_during_training \
    --data_dir </path/to/glue_task> \
    --do_lower_case
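When the training command above is launched from a script (for example, to loop over several tasks), it may help to assemble the argument list programmatically. The helper below is a hypothetical convenience wrapper, not part of NLP Architect; it only uses the flags shown in the command above, and the task name and paths in the usage example are placeholder values that must be replaced with real ones.

```python
import shlex
from typing import List


def build_train_command(task_name: str, data_dir: str, output_dir: str) -> List[str]:
    """Build the argument list for the GLUE training command (hypothetical helper)."""
    return [
        "nlp_architect", "train", "transformer_glue",
        "--task_name", task_name,
        "--model_name_or_path", "bert-base-uncased",
        "--model_type", "bert",
        "--output_dir", output_dir,
        "--evaluate_during_training",
        "--data_dir", data_dir,
        "--do_lower_case",
    ]


if __name__ == "__main__":
    # Placeholder values; substitute a real task name and real directories.
    cmd = build_train_command("mrpc", "/path/to/glue_task", "/path/to/output")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.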
Running a model:
nlp_architect run transformer_glue \
    --model_path <path to model> \
    --task_name <task_name> \
    --model_type bert \
    --output_dir <output dir> \
    --data_dir <path to data> \
    --do_lower_case \
    --overwrite_output_dir
To run evaluation on the task’s development set add the flag
to the command line.